Haoning Wu

Kimi K2.5: Visual Agentic Intelligence

Feb 02, 2026
Via arXiv

WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models

Jan 28, 2026
Via arXiv

Towards Pixel-Level VLM Perception via Simple Points Prediction

Jan 27, 2026
Via arXiv

BabyVision: Visual Reasoning Beyond Language

Jan 10, 2026
Via arXiv

SoccerMaster: A Vision Foundation Model for Soccer Understanding

Dec 11, 2025
Via arXiv

VQualA 2025 Challenge on Visual Quality Comparison for Large Multimodal Models: Methods and Results

Sep 11, 2025
Via arXiv

VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

May 29, 2025
Via arXiv

Scaling-up Perceptual Video Quality Assessment

May 28, 2025
Via arXiv

SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding

May 22, 2025
Via arXiv

Multi-Agent System for Comprehensive Soccer Understanding

May 06, 2025
Via arXiv